-
Notifications
You must be signed in to change notification settings - Fork 76
[intel] improve pitch and width constexpr folding #5489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
3102d07 to
0806403
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves constant expression folding for pitch and width parameters in Intel GPU block I/O operations. The changes introduce a more robust constant evaluation mechanism that handles multiple levels of type casts and operation folding, addressing issue #5338.
- Adds a new
tryConstEvalfunction that recursively evaluates constants through casts and foldable operations - Replaces the previous
skipTrunc-based pitch validation with the new evaluation approach - Adds width validation using the same constant evaluation mechanism
- Updates test kernel to mark stride parameters as
constexprto enable better compile-time evaluation
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp | Implements new constant evaluation helpers and applies them to pitch/width validation in block I/O conversion |
| python/test/unit/language/test_block_pointer.py | Marks stride parameters as constexpr in test kernel to support improved constant folding |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (succeeded(def->fold(results)) && results.size() == 1) { | ||
| if (auto val = dyn_cast_or_null<Value>(results[0])) | ||
| return val; | ||
| } |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fold result could be an Attribute containing a constant value, not just a Value. If results[0] is an Attribute, it should be converted to a constant Value rather than being ignored. This would cause tryConstEval to miss foldable constant attributes.
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp
Outdated
Show resolved
Hide resolved
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp
Outdated
Show resolved
Hide resolved
0806403 to
c1803e9
Compare
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp
Outdated
Show resolved
Hide resolved
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp
Outdated
Show resolved
Hide resolved
third_party/intel/lib/TritonIntelGPUToLLVM/LoadStoreOpToLLVM.cpp
Outdated
Show resolved
Hide resolved
6e73ab6 to
06cce6f
Compare
| if (results.empty()) { | ||
| if (failed(op->fold(results))) | ||
| return std::nullopt; | ||
| if (results.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the comment which explain why we want to fold again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, thanks
06cce6f to
8113e6a
Compare
This PR improves constant expression folding for pitch and width parameters in Intel GPU block I/O operations. The changes introduce a more robust constant evaluation mechanism that handles multiple levels of type casts and operation folding, addressing issue #5338.